{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "post_training_integer_quant.ipynb",
      "version": "0.3.2",
      "provenance": [],
      "private_outputs": true,
      "collapsed_sections": [],
      "toc_visible": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_DDaAex5Q7u-",
        "colab_type": "text"
      },
      "source": [
        "##### Copyright 2019 The TensorFlow Authors."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "W1dWWdNHQ9L0",
        "colab_type": "code",
        "colab": {},
        "cellView": "form"
      },
      "source": [
        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "6Y8E0lw5eYWm"
      },
      "source": [
        "# Post-training integer quantization"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "CIGrZZPTZVeO"
      },
      "source": [
        "<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://www.tensorflow.org/lite/performance/post_training_integer_quant\"><img src=\"https://www.tensorflow.org/images/tf_logo_32px.png\" />View on TensorFlow.org</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/performance/post_training_integer_quant.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/g3doc/performance/post_training_integer_quant.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
        "  </td>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "BTC1rDAuei_1"
      },
      "source": [
        "## Overview\n",
        "\n",
        "[TensorFlow Lite](https://www.tensorflow.org/lite/) now supports\n",
        "converting an entire model (weights and activations) to 8-bit during model conversion from TensorFlow to TensorFlow Lite's flat buffer format. This results in a 4x reduction in model size and a 3 to 4x performance improvement on CPU performance. In addition, this fully quantized model can be consumed by integer-only hardware accelerators.\n",
        "\n",
        "In contrast to [post-training \"on-the-fly\" quantization](https://colab.sandbox.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/lite/tutorials/post_training_quant.ipynb)\n",
        ", which only stores weights as 8-bit ints, in this technique all weights *and* activations are quantized statically during model conversion.\n",
        "\n",
        "In this tutorial, you train an MNIST model from scratch, check its accuracy in TensorFlow, and then convert the saved model into a Tensorflow Lite flatbuffer\n",
        "with full quantization. Finally, check the\n",
        "accuracy of the converted model and compare it to the original saved model. The training script, `mnist.py`, is available from the\n",
        "[TensorFlow official MNIST tutorial](https://github.com/tensorflow/models/tree/master/official/mnist).\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "2XsEP17Zelz9"
      },
      "source": [
        "## Build an MNIST model"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "dDqqUIZjZjac"
      },
      "source": [
        "### Setup"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "gyqAw1M9lyab",
        "colab": {}
      },
      "source": [
        "! pip uninstall -y tensorflow\n",
        "! pip install -U tf-nightly"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "WsN6s5L1ieNl",
        "colab": {}
      },
      "source": [
        "import tensorflow as tf\n",
        "tf.enable_eager_execution()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "00U0taBoe-w7",
        "colab": {}
      },
      "source": [
        "! git clone --depth 1 https://github.com/tensorflow/models"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "4XZPtSh-fUOc",
        "colab": {}
      },
      "source": [
        "import sys\n",
        "import os\n",
        "\n",
        "if sys.version_info.major >= 3:\n",
        "    import pathlib\n",
        "else:\n",
        "    import pathlib2 as pathlib\n",
        "\n",
        "# Add `models` to the python path.\n",
        "models_path = os.path.join(os.getcwd(), \"models\")\n",
        "sys.path.append(models_path)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "eQ6Q0qqKZogR"
      },
      "source": [
        "### Train and export the model"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "eMsw_6HujaqM",
        "colab": {}
      },
      "source": [
        "saved_models_root = \"/tmp/mnist_saved_model\""
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "hWSAjQWagIHl",
        "colab": {}
      },
      "source": [
        "# The above path addition is not visible to subprocesses, add the path for the subprocess as well.\n",
        "# Note: channels_last is required here or the conversion may fail. \n",
        "!PYTHONPATH={models_path} python models/official/mnist/mnist.py --train_epochs=1 --export_dir {saved_models_root} --data_format=channels_last"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "5NMaNZQCkW9X"
      },
      "source": [
        "For the example, you train the model for just a single epoch, so it only trains to ~96% accuracy."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "xl8_fzVAZwOh"
      },
      "source": [
        "### Convert to a TensorFlow Lite model\n",
        "\n",
        "The `savedmodel` directory is named with a timestamp. Select the most recent one: "
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "Xp5oClaZkbtn",
        "colab": {}
      },
      "source": [
        "saved_model_dir = str(sorted(pathlib.Path(saved_models_root).glob(\"*\"))[-1])\n",
        "saved_model_dir"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "AT8BgkKmljOy"
      },
      "source": [
        "Using the [Python `TFLiteConverter`](https://www.tensorflow.org/lite/convert/python_api), the saved model can be converted into a TensorFlow Lite model.\n",
        "\n",
        "First load the model using the `TFLiteConverter`:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "_i8B2nDZmAgQ",
        "colab": {}
      },
      "source": [
        "import tensorflow as tf\n",
        "tf.enable_eager_execution()\n",
        "tf.logging.set_verbosity(tf.logging.DEBUG)\n",
        "\n",
        "converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)\n",
        "tflite_model = converter.convert()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "F2o2ZfF0aiCx"
      },
      "source": [
        "Write it out to a `.tflite` file:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "vptWZq2xnclo",
        "colab": {}
      },
      "source": [
        "tflite_models_dir = pathlib.Path(\"/tmp/mnist_tflite_models/\")\n",
        "tflite_models_dir.mkdir(exist_ok=True, parents=True)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "Ie9pQaQrn5ue",
        "colab": {}
      },
      "source": [
        "tflite_model_file = tflite_models_dir/\"mnist_model.tflite\"\n",
        "tflite_model_file.write_bytes(tflite_model)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "7BONhYtYocQY"
      },
      "source": [
        "To instead quantize the model on export, first set the `optimizations` flag to optimize for size:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "HEZ6ET1AHAS3",
        "colab": {}
      },
      "source": [
        "tf.logging.set_verbosity(tf.logging.INFO)\n",
        "converter.optimizations = [tf.lite.Optimize.DEFAULT]"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "rTe8avZJHMDO",
        "colab_type": "text"
      },
      "source": [
        "Now, construct and provide a representative dataset, this is used to get the dynamic range of activations."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "FiwiWU3gHdkW",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "mnist_train, _ = tf.keras.datasets.mnist.load_data()\n",
        "images = tf.cast(mnist_train[0], tf.float32)/255.0\n",
        "mnist_ds = tf.data.Dataset.from_tensor_slices((images)).batch(1)\n",
        "def representative_data_gen():\n",
        "  for input_value in mnist_ds.take(100):\n",
        "    yield [input_value]\n",
        "\n",
        "converter.representative_dataset = representative_data_gen"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "xW84iMYjHd9t",
        "colab_type": "text"
      },
      "source": [
        "Finally, convert the model like usual. By default, the converted model will still use float input and outputs for invocation convenience."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "yuNfl3CoHNK3",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "tflite_quant_model = converter.convert()\n",
        "tflite_model_quant_file = tflite_models_dir/\"mnist_model_quant.tflite\"\n",
        "tflite_model_quant_file.write_bytes(tflite_quant_model)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "PhMmUTl4sbkz"
      },
      "source": [
        "Note how the resulting file is approximately `1/4` the size."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "JExfcfLDscu4",
        "colab": {}
      },
      "source": [
        "!ls -lh {tflite_models_dir}"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "L8lQHMp_asCq"
      },
      "source": [
        "## Run the TensorFlow Lite models"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "-5l6-ciItvX6"
      },
      "source": [
        "Run the TensorFlow Lite model using the Python TensorFlow Lite\n",
        "Interpreter. \n",
        "\n",
        "### Load the test data\n",
        "\n",
        "First, let's load the MNIST test data to feed to the model:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "eTIuU07NuKFL",
        "colab": {}
      },
      "source": [
        "import numpy as np\n",
        "_, mnist_test = tf.keras.datasets.mnist.load_data()\n",
        "images, labels = tf.cast(mnist_test[0], tf.float32)/255.0, mnist_test[1]\n",
        "\n",
        "mnist_ds = tf.data.Dataset.from_tensor_slices((images, labels)).batch(1)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "Ap_jE7QRvhPf"
      },
      "source": [
        "### Load the model into the interpreters"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "Jn16Rc23zTss",
        "colab": {}
      },
      "source": [
        "interpreter = tf.lite.Interpreter(model_path=str(tflite_model_file))\n",
        "interpreter.allocate_tensors()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "J8Pztk1mvNVL",
        "colab": {}
      },
      "source": [
        "interpreter_quant = tf.lite.Interpreter(model_path=str(tflite_model_quant_file))\n",
        "interpreter_quant.allocate_tensors()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "2opUt_JTdyEu"
      },
      "source": [
        "### Test the models on one image"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "AKslvo2kwWac",
        "colab": {}
      },
      "source": [
        "for img, label in mnist_ds:\n",
        "  break\n",
        "\n",
        "interpreter.set_tensor(interpreter.get_input_details()[0][\"index\"], img)\n",
        "interpreter.invoke()\n",
        "predictions = interpreter.get_tensor(\n",
        "    interpreter.get_output_details()[0][\"index\"])"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "XZClM2vo3_bm",
        "colab": {}
      },
      "source": [
        "import matplotlib.pylab as plt\n",
        "\n",
        "plt.imshow(img[0])\n",
        "template = \"True:{true}, predicted:{predict}\"\n",
        "_ = plt.title(template.format(true= str(label[0].numpy()),\n",
        "                              predict=str(predictions[0])))\n",
        "plt.grid(False)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "3gwhv4lKbYZ4",
        "colab": {}
      },
      "source": [
        "interpreter_quant.set_tensor(\n",
        "    interpreter_quant.get_input_details()[0][\"index\"], img)\n",
        "interpreter_quant.invoke()\n",
        "predictions = interpreter_quant.get_tensor(\n",
        "    interpreter_quant.get_output_details()[0][\"index\"])"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "CIH7G_MwbY2x",
        "colab": {}
      },
      "source": [
        "plt.imshow(img[0])\n",
        "template = \"True:{true}, predicted:{predict}\"\n",
        "_ = plt.title(template.format(true= str(label[0].numpy()),\n",
        "                              predict=str(predictions[0])))\n",
        "plt.grid(False)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "LwN7uIdCd8Gw"
      },
      "source": [
        "### Evaluate the models"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "05aeAuWjvjPx",
        "colab": {}
      },
      "source": [
        "def eval_model(interpreter, mnist_ds):\n",
        "  total_seen = 0\n",
        "  num_correct = 0\n",
        "\n",
        "  input_index = interpreter.get_input_details()[0][\"index\"]\n",
        "  output_index = interpreter.get_output_details()[0][\"index\"]\n",
        "\n",
        "  for img, label in mnist_ds:\n",
        "    total_seen += 1\n",
        "    interpreter.set_tensor(input_index, img)\n",
        "    interpreter.invoke()\n",
        "    predictions = interpreter.get_tensor(output_index)\n",
        "    if predictions == label.numpy():\n",
        "      num_correct += 1\n",
        "\n",
        "    if total_seen % 500 == 0:\n",
        "      print(\"Accuracy after %i images: %f\" %\n",
        "            (total_seen, float(num_correct) / float(total_seen)))\n",
        "\n",
        "  return float(num_correct) / float(total_seen)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "T5mWkSbMcU5z",
        "colab": {}
      },
      "source": [
        "# Create smaller dataset for demonstration purposes\n",
        "mnist_ds_demo = mnist_ds.take(2000)\n",
        "\n",
        "print(eval_model(interpreter, mnist_ds_demo))"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "Km3cY9ry8ZlG"
      },
      "source": [
        "Repeat the evaluation on the fully quantized model to obtain:"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "id": "-9cnwiPp6EGm",
        "colab": {}
      },
      "source": [
        "# NOTE: Colab runs on server CPUs. At the time of writing this, TensorFlow Lite\n",
        "# doesn't have super optimized server CPU kernels. For this reason this may be\n",
        "# slower than the above float interpreter. But for mobile CPUs, considerable\n",
        "# speedup can be observed.\n",
        "# Only use 2000 for demonstration purposes\n",
        "print(eval_model(interpreter_quant, mnist_ds_demo))"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "L7lfxkor8pgv"
      },
      "source": [
        "In this example, you have fully quantized a model with no difference in the accuracy."
      ]
    }
  ]
}